Reporting on Covid-19 in Arkansas: Data Journalism
Jour 5283, Fall 2020
Remote Delivery, Monday-Wednesday 9:40 a.m.-10:55 a.m.

Rob Wells, Ph.D.

@rwells1961

Syllabus - Jour 5283: “CNTL” + click for a New Tab

Course GitHub site: “CNTL” + click for a New Tab


Inspiration:


Week #1: Course Tools, R Basics

Monday, Aug. 24

Agenda:

--Discuss syllabus
--Blackboard site
--Teams
--Intellectual Property / Data Sharing Releases
--Arkansascovid.com
--Intro R and R Studio. Open program.
--Read Machlis, Ch. 2
--Advanced learners, see below

Teams

  For the first class, please have Teams installed so we can do some exercises.    
  Teams is free through your university Office365 account. Download the Teams App through the Office365 suite.

https://its.uark.edu/communication-collaboration/office365/office365-desktop-apps.php

Teams Videos

  Microsoft Teams allows us to easily share information through the class or in discrete groups.
  
  Chat in Teams

https://www.microsoft.com/en-us/videoplayer/embed/RE4rLgJ?pid=ocpVideo5-innerdiv-oneplayer&postJsllMsg=true&maskLevel=20&market=en-us

  Create a post

https://www.microsoft.com/en-us/videoplayer/embed/RE2BIrO?pid=ocpVideo0-innerdiv-oneplayer&postJsllMsg=true&maskLevel=20&market=en-us

  How to tag a person in Teams

https://www.microsoft.com/en-us/videoplayer/embed/RWkJ9C?pid=ocpVideo0-innerdiv-oneplayer&postJsllMsg=true&maskLevel=20&market=en-us

R and R Studio

Install R and R Studio.

This is free and open source software. It is not large and doesn't tax the memory a lot. 
R runs on Windows, Mac and Linux, but this course is designed for the Mac version. 
If you use Windows, there may be variations in the lessons and instructions. Please see me      for questions.

Installing R is a two-step process: 
1) Install R, the actual program
2) Install RStudio, a common interface 

1) Download the most recent version of R for Mac:         

https://mirrors.nics.utk.edu/cran/bin/macosx/R-4.0.2.pkg

--If you have a Windows computer, go to: 

https://mirrors.nics.utk.edu/cran/bin/windows/base/R-4.0.2-win.exe

Accept all of the default settings for Mac.

 2) Install RStudio, the interface we use to manage and create R code. Download the open         source edition of R Studio desktop and follow the prompts to install it.   

https://rstudio.com/products/rstudio/download/#download

 Good instructions for installing R

http://www.machlis.com/R4Journalists/download-r-and-rstudio.html

Good overview of the program

https://docs.google.com/presentation/d/1O0eFLypJLP-PAC63Ghq2QURAnhFo6Dxc7nGt4y_l90s/edit#slide=id.p

Intellectual Property / Data Sharing Releases

UofA Rules of Conduct

https://docs.google.com/document/d/1YkdkRIzIs1WQ3P9KIICvHcppfWvwTyo2bRhGwQPsgVE/edit

License Agreement

https://docs.google.com/document/d/1AahzxDOzTf9Z6PBjBvFBOjnn9_BiM4YldnXW-mHZr9s/edit

Read Before Wednesday’s Class

    Machlis, Sharon. Practical R for Mass Communications and Journalism. Chapman & Hall/CRC The R Series. 2018. ISBN 9781138726918 https://www.amazon.com/gp/search?keywords=9781138726918
    Ch. 1, Introduction; Ch. 3, See How Much You Can Do in a Few Lines of Code
   Chapter 2: Get Started With R in a Few Easy Steps
   
   Here is the link to the firts six chapters for free
   

http://www.machlis.com/PracticalR4Journalism/index.html (first six chapters are free)

   Basic tutorial on R:
   

https://profrobwells.github.io/Guest_Lectures/Intro_To_R/R1_Intro-to-R.html


CORRECT LINK: Download the code and try it yourself

  (Left click, download, remove.txt extension)

https://github.com/profrobwells/CovidFall2020/raw/master/Exercises/R1_Intro%20to%20R.Rmd

Week #2: File management

Monday, Aug. 31

Agenda

Arkansascovid.com data
Building Your Own R Markdown Files
Tuesday Quiz on Basic R functions

Exercise

Arkansascovid.com exercise Lesson #1 
Download this tutorial to work with Arkansascovid.com data

https://github.com/profrobwells/CovidFall2020/blob/master/Exercises/Arkansas%20Covid%20First%20Lesson%208-20-2020.Rmd

Quiz #1: R Terms. See Blackboard. Due 11:59 p.m. Tuesday Sept. 1

See Blackboard:

https://learn.uark.edu/webapps/assessment/take/launchAssessment.jsp?course_id=_276555_1&content_id=_8816418_1&mode=cpview

Read Before Next Class

Machlis. Ch 4: Import Data into R  
Cohen, "Numbers in the Newsroom," Common Mistakes. 
Basics of Data Analysis (On Blackboard)

Wednesday, Sept. 2

Agenda

Discuss Quiz on Basic R functions
Arkansascovid.com exercise Lesson #1
Discuss Ch 4, Machlis: See Notes
  • Ch 3 & 4 of Machlis: Key Points

    Ch 3 Exercises: Stock chart exercise used quantmod is a library for financial analysis. Median Income for a City Loading packages

    Ch 4 Importing Data How read.table() works for importing data:
    Loading data Manipulating data: dplyr - stringr Data Management: mutate rename bind_rows

Read Before Next Class

Machlis, Ch. 5: Basic Data Exploration

Beginner's guide to R:

https://www.computerworld.com/article/2497143/business-intelligence/business-intelligence-beginner-s-guide-to-r-introduction.html


Week #3: Data Types. Loading Data

Monday, Sept. 7.

Happy Labor Day! No Class

Read Before Next Class

 Machlis, Ch. 6: Beginning data visualization
 Video: Basic Charts in R  

https://www.youtube.com/watch?v=1EUJ0tsVsUA&t=12s

Refresher: Numbers in the Newsroom (On Blackboard)

https://learn.uark.edu/webapps/blackboard/content/listContent.jsp?course_id=_276555_1&content_id=_8815471_1

Wednesday, Sept. 9

Agenda

Describe Assignment #1 
Continue with R tutorial
Exercise: Loading Data from U.S. Census & Student Loans
top_n function makes life easy  
Sorting

Assignment #1

  Due Sept. 14: Managing Data / Static Graphic 
  Static Graphic - Managing Data in R.  
  Students will use R Studio to gather, analyze and visualize Arkansascovid data by                  demographic for Arkansas and report and write a 600 word story. 
  • Exercise

    Downloading Data 3rd Lesson 3 8-21-2020.Rmd
    Check what software packages are running: Global Environment
    Navigation: 

    ^ + shift + 8 = Zoom to Environment
    AP Style on numbers: AP Numerals Entry.pdf

Notes

 --The pie chart focuses the reader on large percentages, and encourages the reader to think of the total
--The stacked bar plot provides the same information, but makes it easier to accurately determine at a glance how large each group is out of the whole.
--This bar chart splits the categories horizontally, and draws attention to how the family members are ordered. It encourages the reader to think about the distribution rather than disconnected categories, and gives a better sense of sense of scale.     
  

Read Before Next Class

Wong, Dona M. The Wall Street Journal Guide to Information Graphics. 
Ch. 2: Chart Smart

Charts_with_ggplot by Andrew Ba Tran

Grammar of Graphics

http://vita.had.co.nz/papers/layered-grammar.html


Week #4: Visualizatiion

Monday, Sept. 14

Agenda

Data Visualization
ggplot2 - charts and maps
Export Static chart 
Discuss Ch 5, Machlis: See Notes

Assignment #1: Static Graphic - Managing Data in R. Due Sept. 14

Students will use R Studio to gather, analyze and visualize Arkansascovid data by demographic for Arkansas and report and write a 600 word story.

Exercise

  Tutorial: Graphing Introducton Jan 11 2020.R
  • GGPLOT

    A handy explanation of ggplot and its components

If you’re using ggplot: plus it!
For everything else: pipe it!

aes - reorder in ggplot 
 
geom_point() 
geom_bar()
geom_boxplot() 
  • Data Visualization Intro

    Load tutorial: Basic Data Visualization 12-26-18.R

Download this file and open it in R Studio

Read Before Next Class

Wong, Dona M. The Wall Street Journal Guide to Information Graphics. 
Ch. 3 & 4: Ready Reference and Tricky Situations

Prof. Alberto Cairo on Data Visualization    

https://video.uark.edu/media/Alberto+Cairo+Data+Viz/0_10pblm1b


Wednesday, Sept. 16

Agenda

Data Visualization
ggplot2 - charts and maps


> [**GGplot Video from Andrew Ba Tran**](https://www.youtube.com/watch?v=Sx7d7eGRSj0&t=9s){target="_blank"} 

  • Exercise

    Graphing in GGPlot
    https://bit.ly/2Gqjfj4

    Multiple variable in a graph Geom_Line, Geom_point, Geom_bar How to alter the colors in a chart.

Quiz - Math and R

TK TK TK Test Canvas: Excel Quiz

Read Before Next Class

Wong, Dona M. The Wall Street Journal Guide to Information Graphics. 
Ch. 5: Charting Your Course

Albert Cairo, "The Functional Art," Principles of Data Visualization.

http://learn.r-journalism.com/en/wrangling/dplyr/dplyr/


Week #5: Dplyr Bootcamp

Monday, Sept. 21

Agenda

Review Assignment #1   
Assignment1_KEY_StaticGraphic_2_9.R    
Dplyr bootcamp. 
IRE Conference
  • **Dplyr Presentation*

    Five basic verbs filter() select() arrange() mutate() summarize() plus group_by()

  • Pipes - a Much-Used Command to Link Filters, Functions

    pipe %>% CMD + Shift + M

    Pipes are a way of chaining commands.
    object %>% operation() —> result

    Presentation from Bob Rudis on Writing Readable Code with Pipes, delivered at the rstudio::conf 2017.
    https://www.rstudio.com/resources/videos/writing-readable-code-with-pipes/

  • Key Concepts - Moving Forward:

    Dplyr: Filters, Grouping, Sorting, pipes %>%
    Pipe shortcut = CMD + SHIFT + M Basic data visualization
    Tidyverse

  • Exercise:

    DPLYR BOOT CAMP 5th Lesson 8-21-2020.Rmd

    Make a Dplyr Cheat Sheet, Hand in on Blackboard by 11:59 pm Monday.

  • You are in Dplyr bootcamp!

Read Before Next Class

Machlis: Ch. 8 Analyze data by groups
Transforming and Analyzing Data dplyr.pdf, Andrew Ba Tran, Washington Post
Dplyr - Andrew Ba  Tran - pipes-dplyr.pdf

Wednesday, Sept. 23

Agenda

IRE Conference
Dplyr boot camp!!!
There is nothing else. Focus!!!
  • DPLYR

    DPLYR BOOT CAMP 5th Lesson 8-21-2020.Rmd

Code in Fun

Code in Fun

Notes: How Do I?

https://smach.github.io/R4JournalismBook/HowDoI.html

Read Before Next Class

Machlis 
Ch. 13 Date calculations
Dealing-with-dates.pdf by Andrew Ba Tran  

https://github.com/profrobwells/Data-Analysis-Class-Jour-405v-5003/blob/master/Readings/dealing-with-dates.pdf


Week #6: Dealing With Dates

Monday Sept. 28

Agenda

Lubridate
Review Ch. 13 Machlis
Review Tran and Lubridate

Exercises

–Using Lubridate

  The exercise: **TO BE UPDATED**   
  Lubridate_Intro_Feb_20.R

https://bit.ly/2H07YpX Lubridate vignette
browseVignettes(“lubridate”)

  What we will produce    

Read Before Next Class

Machlis
Ch. 9 Graphing by Group
Ch. 11 Maps in R

Wednesday Sept. 30

Agenda

Lubridate

Exercises

**TO BE UPDATED**
Key to Lubridate Questions in Exercise:    

https://bit.ly/2BO92d1

Read Before Next Class

Machlis
Ch. 11 Maps in R

Visual Narrative Tricks by Albert Cairo

https://www.youtube.com/watch?v=TSGaueL4Ggk


Week #7: Maps in R

Monday Oct. 5

Agenda

  Mapping
  

Exercises

      Mapping Demo: Map Demo 6th Lesson 8-22-2020.Rmd
      Mapping Exercise from Machlis book, Ch. 11      

https://bit.ly/2VXCSU2

Read Before Next Class

Joining Dataframes in R  

https://www.youtube.com/watch?v=gLg4D9bMIyc&t=13s


Wednesday Oct. 7

Agenda

  Maps
  Andrew Ba Tran - Mapping

https://nicar.r-journalism.com/docs/ http://learn.r-journalism.com/en/mapping/
Data Cleaning
Disaggregating variables for summation

Read Before Next Class

Ch. 7 Two or more data sets 

Data Wrangling

http://learn.r-journalism.com/en/wrangling/


Week #8: Maps in R

Monday, Oct. 12

Agenda

  Video of Machlis mapping  

https://www.youtube.com/watch?v=HFJOV5XaU_U See R script: Maps in R March 24 2019 https://bit.ly/2FvZDrB

Assignment 2. Graphic with Multiple Data Sources. Due Oct 12

Students will use R Studio to gather, analyze and visualize Arkansascovid data by demographic for       Arkansas using Census data or school district data. Results will be posted on GitHub. 600 Word          story. Data dictionary required
Students will produce publication-ready graphics from data. 
With the assigned dataset, XXXX.csv, students will produce the following tables:
Most common words used in data text field
Based on this information, write a 600 word story, following AP style, that describes potential         newsworthy trends. 
By 11:59 pm., you will submit the following on Blackboard:  
XXX charts in .jpeg format: 1) Common words 2) Common hashtags 3) Date trend chart
One Google Doc with your findings, 600 words. Append at the end a brief data dictionary.
An R script with the coding that shows how you loaded and cleaned the data and produced the charts

Read Before Next Class

Machlis: Ch. 14 Integrate R With Your Storytelling Using R Markdown

Wednesday Oct. 14: Mapping

Agenda

  Adapt Machlis - Maps in R Ch 11 exercise for Median Income in Arkansas
  Maps in R March 24 2019 - Using the Census API

https://bit.ly/2FvZDrB

  Building a Census tract map
  

We will build this

TK TK Images/Income by Census Tract Washington Co.png)    

Exercises

Map Demo: Interactive Map.R
Adapt Machlis - Maps in R Ch 11 exercise for Median Income in Arkansas
![You will make this in the class]TK TK Images/ARmed_income3-22-19.png

Read Before Next Class

Machlis Ch. 15 Simple Web scraping
APIs - basics   

https://medium.com/@LewisMenelaws/a-beginners-guide-to-web-apis-and-how-they-will-help-you-23923a0da450


Week #9: APIs

Monday Oct. 19

Agenda

  Sign up for a census key:

https://api.census.gov/data/key_signup.html What is an API?

Read Before Next Class

Machlis Ch. 12 Putting it all Together: R on Election Day

A gentle introduction to APIs for data journalists:   

https://trendct.org/2016/12/29/fetching-airport-delays-with-python-a-gentle-guide-to-apis-for-journalists/

tran mapping-census-data.html

Wednesday, Oct. 21 APIs

Agenda

APIs
Census API Exercise
tran mapping-census-data.rmd

Read Before Next Class

Machlis Chs. 17 & 18
Census Reporter to look up tables

https://censusreporter.org/


Week #10: APIs

Monday Oct 26

Agenda

  Twitter API
  Twitter Scraping

Exercises

Twitter analysis of Trump Tweets 

http://varianceexplained.org/r/trump-tweets/

Twitter historical API

https://developer.twitter.com/en/docs/tutorials/choosing-historical-api

Read Before Next Class

 Max Harlow Presentation on How to Use GitHub

https://docs.google.com/presentation/d/1MbltRcOerktc-E26HMDjYj0BO9CTubQWu1Z2bB9CpVY/edit#slide=id.g448ccc227721fe56_10

“Connecting the Dots” by Jacob Harris (2015) and discuss how people should or should not be represented through news visualizations.

Wednesday Oct 28: GitHub

Agenda

    GitHub Exercises
    

Resources:

This class is intended to teach you modern workflow techniques for coding. A centerpiece of that workflow is GitHub. This is a website with a system that allows you to collaborate with other programmers on coding projects. It manages versions of software code and is a very popular with the tech elite. 

Your GitHub account, which is public, represents an important professional image. Prospective employers and collaborators will look at your GitHub account.

Create a GitHub account.

https://github.com/

Simplified GitHub- GitHub Desktop

https://help.github.com/en/desktop

Exercises

See Basic GitHub 4-22-19.R

https://bit.ly/2UAMGTd

Installing Git for a Mac - Andrew Ba Tran

Read Before Next Class

--Follow this tutorial

https://guides.github.com/activities/hello-world/


Week #11: GitHub, Data Cleaning

Monday, Nov. 2

Agenda

  Using GitHub
  Set up R Project With GitHub
  Setting up an R Workflow

http://learn.r-journalism.com/en/publishing/workflow/r-projects/

Resources on GitHub

  GitHub flow   

https://guides.github.com/introduction/flow/

  GitHub Guides  

https://guides.github.com/

  Another GitHub guide  

https://andrewbtran.github.io/NICAR/2018/workflow/docs/03-integrating_github.html

Assignment 3: Interactive Map. Due Nov 2

  Students will use R Studio to build interactive maps of Arkansascovid occupational data in        Arkansas. Results will be posted on GitHub. Data dictionary required

Exercises

Extracting Text Strings from data

https://bit.ly/2X5V9jL

Joins in R: 

https://bit.ly/2OFnGJ6

Read Before Next Class

  TK 

Wednesday, Nov. 4

TAKE A DEEP BREATH, AMERICA

Agenda

Joins in R
Data Cleaning and Text Mining
 

Read Before Next Class

Tidy text mining

https://www.tidytextmining.com/tidytext.html#

Bad data visualizations. Data Translation. 

The Journalist as Programmer: A Case Study of The New York Times Interactive News Technology            Department 

http://isoj.org/wp-content/uploads/2016/10/ISOJ_Journal_V2_N1_2012_Spring.pdf

What is code?

http://www.bloomberg.com/graphics/2015-paul-ford-what-is-code/


Week #12: Data Cleaning

Monday, Nov. 9

Agenda

Data Cleaning and Normalization
Review Assignment #3

Exercises

Splitting Hashtags 2-25-19.R   

https://bit.ly/2BQIE2i

R Markdown to distribute findings to Stanford, Feb 2020

https://profrobwells.github.io/HomelessSP2020/SF_311_Calls_UofA.html

Read Before Next Class

Julia Angwin, Terry Parris Jr., Surya Mattu. “Breaking the Black Box: What Facebook Knows About         You,” ProPublica, 2016; 

Nicholas Diakopolous, “Algorithmic Accountability,” Digital Journalism, 2014.

Wednesday, Nov. 11

Agenda

Data Cleaning and Normalization
Text Mining

Exercises

  Kavanaugh text mining story: Text analysis of Brett Kavanaugh’s opinion.  

http://www.storybench.org/bringing-textual-analysis-tools-to-judge-brett-kavanaughs-latest-opinion/

Read Before Next Class

  Text Mining with R

https://www.tidytextmining.com/ Ch. 1 The Tidy Text Format Ch. 2 Sentiment Analysis


Week #13: Text Mining

Monday, Nov. 16

Agenda

Text Mining Exercises
Coulter-AOC Twitter Feeds

Exercise

AOC-Coulter Data Mining Exercise: 
Coulter Tweet Analysis #2 Exercise

https://bit.ly/2TdY9Mv Key to the exercise:
https://bit.ly/2F3brlh

Read Before Next Class

  Text Mining with R

https://www.tidytextmining.com/ Ch 3 Analyzing word and document frequency: tf-idf


Wednesday, Nov. 18

Agenda

  Continue With Text Mining
  Turn your R cheatsheet into a PDF
  Turn your R cheatsheet into a web page on GitHub

Exercise

  Sentiment of Gov. Hutchison's Twitter Feed
  Bigrams: 

http://bit.ly/bigramz

Coulter Bigrams - Score
TK TK Images/Coulter bigram score.jpeg)

Week #14: Web scraping

Monday, Nov. 23

Agenda

Simple Web Scraping

Exercises

Web Scraping in R: Simple Web Scraper 

http://bit.ly/scrapeme

Read Before Next Class


Wednesday, Nov. 25

Agenda

OFF - THANKSGIVING

Week #15: Wrap Up

Monday, Nov. 30

Agenda

Flexible schedule. Reporting Updates

Assignment 4. Interactive Data Visualization. Due Nov 30.

Students will use R Studio to build interactive graphics / maps of Arkansascovid data by school district. 600 Word Story. Results will be posted on GitHub. Data dictionary required

Exercises

Splitting Hashtags 2-25-19.R   

https://bit.ly/2BQIE2i


Wednesday, Dec. 2

Agenda

 Flexible schedule. Reporting Updates

Week #16: Wrap Up

Monday, Dec. 7

Agenda

Flexible schedule. Reporting Updates

Important!

Course Evaluation
Please do me  a favor and evaluate this course. 
It's important to me and the department to get your thoughts
on what worked and what did not. 
If you think it is important, 
then please take five minutes to fill out the survey.

https://courseval.uark.edu/

Wednesday, Dec. 8

Agenda

  Zoom Party

Congratulations!

 We covered a lot this semester

Twitter Exercises

    Coulter Tweet Analysis #2 Exercise
    https://bit.ly/2TdY9Mv
    
    Twitter exploration exercise
    https://bit.ly/2Sqn1j1
    
    Return to Twitter Engagement
    http://bit.ly/2GParD5
    
    AOC Twitter feed   
    https://bit.ly/2Sqn1j1  
    
    Discuss Twitter Metadata     
    https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/user-object   
    
    Study Twitter meta data  
    https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object.html
    Look at this example: Ocasio.csv
    
    Twitter analysis of Trump Tweets
    http://varianceexplained.org/r/trump-tweets/
    
    Show Collins results  
  • Bots

      Bot or Not: Difficulty determining a bot on Twitter    
    
      An app that uses machine learning to guess if a Twitter account is a bot   
      https://www.r-bloggers.com/botrnot-an-r-app-to-detect-twitter-bots/    
      https://mikewk.shinyapps.io/botornot/    
    
      Article about Botometer   
      https://www.vox.com/technology/2018/4/9/17214720/pew-study-bots-generate-two-thirds-of-twitter-links   
    
      Stanford research paper on this topic    
      https://pdfs.semanticscholar.org/e219/6b47133c2191d380098744c13ba77133e625.pdf   

–30–